SHADING 3-DIMENSIONAL COMPUTER GENERATED IMAGES

This invention relates to the shading of 3-dimensional
computer generated images and to a method and apparatus for
performing this.

In our British Patent No. 2281682, there is described a 3-D
rendering system for polygons in which each object in a scene
to be viewed is defined as a set of surfaces which are
infinite. Each elementary area of the screen in which an image
is to be displayed has a ray projected through it from a
viewpoint into the 3-dimensional scene. The location of the
intersection of the projected ray with each surface is then
determined. From these intersections it is then possible to
determine whether any intersected surface is visible at that
elementary area. The elementary area is then shaded for
display in dependence on the result of the determination.

The system can be implemented in a pipeline type processor
comprising a number of cells, each of which can perform an
intersection calculation with a surface. Thus a large number
of surface intersections can be computed simultaneously. Each
cell is loaded with a set of coefficients defining a surface
for which it is to perform the intersection test.

A further improvement, which is described in our UK Patent
Application No. 2298111, sub-divides the image plane into
sub-regions or tiles. This proposes using a variable tile size
and projecting a bounding box around complex objects. This is
done by firstly determining the distribution of objects around
the visible screen for suitable tile sizes to be selected. The
surfaces defining the various objects are then stored into one
contiguous list. This avoids the need to store identical
surfaces for each tile, as one object being made of many
surfaces could be in a number of tiles. The tiles can then be
rendered in turn using the ray casting technique described
above, one at a time, rendering all objects within that tile.
This is an efficient method because no effort needs to be made
to render objects which are known not to be visible in a
particular tile.

We have appreciated that the amount of processing can 
be reduced further if only data pertaining to portions of 
surfaces which are in fact visible is processed. Thus, in 
accordance with a preferred embodiment of the invention we
provide a method for defining the edges of visible 
surfaces with planes which are perpendicular to the 
viewing direction. 

In accordance with a second aspect of the invention, we have
appreciated that rather than use a variable tile size, the
processing may be optimised by using a regular tile size
across the whole of the image plane wherein the tile
boundaries may intersect with objects but with no edge
clipping being necessary. A set of tiles can then be selected
which define a bounding box for a particular object and, in
order to render that particular object, only the tiles within
that particular bounding box need to be processed. A display
list of the surfaces which fall within that tile is used to
define objects within the bounding box.

A further improvement on this method discards the tiles
within a bounding box which do not actually contain the object
to be rendered.

Preferred embodiments of the invention will now be described
in detail by way of example with reference to the accompanying
drawings in which:

Figure 1 shows a graphical representation of a triangular
surface for use in representing a portion of an object;

Figure 2 shows how the positive and negative sides of the
surfaces are used to determine the visible portion of the
triangle of Figure 1;

Figure 3 shows schematically the edge processors and surface
processors used to shade the triangle of Figure 2;

Figure 4 shows schematically a screen with an object on it
with the screen divided into an array of 25 tiles and with an
object broken up into triangles in a conventional method;

Figure 5 shows schematically the object lists which are used
in accordance with an embodiment of the invention;

Figure 6 shows a triangle within a bounding box of 12 tiles;

Figures 7a, b, c and d show a number of different triangles
with different bounding boxes and the different numbers of
tiles required to display them;

Figure 8 shows an optimised selection of tiles for the
triangle in Figure 7d;

Figure 9 shows a table which illustrates the tests used to
determine the tiles in Figure 8 which are not required to
display the triangle;

Figure 10 shows a rectangular set of tiles with a test point;
and

Figure 11 shows a block diagram of the circuits used to
generate bounding boxes.

In our British Patent No. 2281682, the rendering system
summarised in the introduction of this specification is
described. We have appreciated that any object can be modelled
as a set of triangles. Thus, these would be the infinite
surfaces which would be processed in that patent. In that
patent, the edges of the objects would comprise the
intersections of the infinite surfaces and the relative depths
of forward and backward facing surfaces used to determine
whether or not a particular surface was visible (if a
backwards facing surface is closer than a forwards facing
surface then neither is visible at a particular pixel).

We have appreciated that processing can be improved by
defining the edges of triangles by infinite surfaces which are
perpendicular to the viewing point. Thus, for a triangle, four
surfaces are required, one for the face and three for the
edges, one per edge.

Before a triangle can be rendered, it is necessary to
calculate the equations for each surface. These are calculated
in a polygon setup unit from vertex data supplied by the
application software. The equation for a perpendicular edge
surface between two vertices v_1 and v_2, shown in Figure 1
and which are located at (x_1, y_1, z_1) and (x_2, y_2, z_2),
is defined by:

    (y_2 - y_1)x - (x_2 - x_1)y + (x_2 y_1 - y_2 x_1) = 0

which is of the form:

    Ax + By + C = 0

which is the equation of a plane surface.

When the equation has a positive result for particular xy
values (pixel locations) then the xy location is on the
forward facing side of the edge surface, and when it has a
negative value then the xy location is on the backward facing
side of the surface. Thus, when all four equations
representing the triangle in Figure 1 have a positive value
then the pixel position is within the triangle as illustrated
in Figure 2. This rule holds true for any shape used in
preference to a triangle, e.g., a quadrilateral.
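
The sign test on the three edge surfaces can be sketched in a
few lines of Python. This is a minimal illustration under
stated assumptions, not the hardware described here: the
function names are hypothetical, vertices are taken as 2-D
screen coordinates, a zero result is treated as positive, and
the edge signs are normalised so that the interior of the
triangle is the positive side whatever the vertex order. The
fourth (face) surface is not modelled.

```python
def edge_coefficients(v1, v2):
    """Return (A, B, C) of the edge surface Ax + By + C = 0 through v1 and v2."""
    (x1, y1), (x2, y2) = v1, v2
    return (y2 - y1, -(x2 - x1), x2 * y1 - y2 * x1)


def triangle_edges(v1, v2, v3):
    """Edge coefficients for a triangle, signed so the interior is positive."""
    edges = [
        edge_coefficients(v1, v2),
        edge_coefficients(v2, v3),
        edge_coefficients(v3, v1),
    ]
    # Evaluate the first edge at the opposite vertex. If it is negative the
    # vertex order makes the interior negative, so flip every edge.
    a, b, c = edges[0]
    if a * v3[0] + b * v3[1] + c < 0:
        edges = [(-ea, -eb, -ec) for (ea, eb, ec) in edges]
    return edges


def pixel_inside(edges, x, y):
    """True when every edge equation is non-negative at pixel (x, y)."""
    return all(a * x + b * y + c >= 0 for (a, b, c) in edges)
```

For example, with the triangle (0, 0), (4, 0), (0, 4) the pixel
(1, 1) passes all three edge tests while (3, 3) fails the test
for the sloping edge.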

A preferred embodiment of the invention is shown in Figure 3.
In this, there is a polygon setup unit 2 which receives vertex
data defining triangles and supplies the facing surface data
for the triangles to respective ones of the set of 32 surface
processors 4. At the same time, for each triangle being
processed by a surface processor 4 it supplies three sets of
edge data to each one of three arrays of edge processors 6.
These each comprise a depth evaluation unit which determines
whether or not the value for the edge surface they are
processing is positive or negative for each of 32 particular
pixel locations. The output of each of these is a positive or
negative sign bit, and these sign bits for the three surfaces
are supplied to the appropriate surface processor for that
triangle. When all of the sign bits are positive as described
above, then that surface processor knows that the triangular
surface it is processing is visible, that is to say it is not
outside the edge of the triangle, and thus it will supply a
depth value as an output which will go to a depth store, after
which further tests can be made on it to determine whether or
not it is to be used to make a contribution to the image being
processed. If one of the sign bits is negative then the
surface processor 4 does not need to do anything.

The edge processors operate in the x direction, i.e., along a
scan line in an image and, in a system which uses an array of
32 surface processors 4, will typically operate in a tile
based system processing blocks of 32 x 32 pixels. The input
value to each edge processor will therefore be equivalent to
By + C. The edge processor uses an inaccurate non-restoring
division algorithm which operates on the edge of the triangle.
This algorithm effectively calculates

    x = -(By + C) / A

This is possible because the y value is constant along a
particular scan line, and thus By + C is a constant along that
scan line.

Table 1 shows the arithmetic operation involved in
calculating the position of a transition point from inside to
outside (positive to negative depth) of an edge.

Table 1 - Inaccurate Non-Restoring Division in an Edge Processor



The operation performed in stage 1A effectively moves the
sample point to the middle in terms of x. This is possible
because the setup unit moves the origin location
(x,y) = (0,0) to the top lefthand corner of the tile. The
operation column indicates the test performed to calculate
whether an addition or subtraction should be performed on the
accumulated C value in the next clock cycle. These tests are
essentially a form of binary search, where each
addition/subtraction moves us closer to the zero crossing
point. For example, say that the 0 transition is at 13.

    Operation    Result               X location

    Start        C = -ve, A = +ve          0
    Add 16A      C = +ve                  16
    Sub 8A       C = -ve                   8
    Add 4A       C = -ve                  12
    Add 2A       C = +ve                  14
    Sub A        C = 0 (+ve)              13
    Sub A                                 12



The signs of the additions/subtractions which are performed
by the edge processor are used to calculate the transition
point or edge. Once this pixel position has been determined,
it can then be used to create a mask for a whole line of the
tile. This mask represents a positive/negative depth value for
each pixel within the line. The operation may be pipelined
using the arrays of depth processors referred to above so that
an edge mask for a line of pixels within a tile can be created
every clock cycle. As explained above, the y coefficient for
the edge equation is accumulated into constant C before the
edge is processed. This allows an edge mask for a complete
tile of 32 x 32 pixels to be generated over h = 32 clock
cycles, where h is the height of the tile.
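
The add/subtract search for the transition point, and the line
mask built from it, can be sketched as follows. This is an
illustrative software model only: `find_transition` and
`line_mask` are hypothetical names, the sketch assumes a
positive A coefficient (so the edge value increases with x),
and a zero value is treated as positive as in the worked
example.

```python
def find_transition(a, c, width=32):
    """Find the first x in 0..width with a*x + c >= 0, assuming a > 0.

    The accumulated value is tested after each step and the step size
    halves each time, as in the worked example. Returns the transition
    position and a trace of (operation, x) pairs.
    """
    assert a > 0, "sketch only covers edges with a positive A coefficient"
    step = width // 2
    x, value = step, c + step * a  # the first move is always an addition
    trace = [("start", 0), (f"add {step}A", x)]
    steps = []
    s = step // 2
    while s >= 1:
        steps.append(s)
        s //= 2
    steps.append(1)  # final single-pixel correction step
    for s in steps:
        if value >= 0:  # zero counts as positive
            x, value = x - s, value - s * a
            trace.append((f"sub {s}A", x))
        else:
            x, value = x + s, value + s * a
            trace.append((f"add {s}A", x))
    transition = x if value >= 0 else x + 1
    return transition, trace


def line_mask(transition, width=32):
    """Per-pixel mask for one scan line: True where the edge value is positive."""
    return [x >= transition for x in range(width)]
```

With `a = 1` and `c = -13` the positions visited are 0, 16, 8,
12, 14, 13 and 12, and the reported transition is 13, which
agrees with the example above.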

The masks for all three edges in the triangle are ANDed
together to create a depth mask for the triangle. The signs of
the accumulated depth at the pixel position are passed to the
surface processors 4. When the depth is positive the surface
is visible. Thus, using this method a triangle can be
processed at the same speed as a single surface. Clearly, if
four edge processors or more were available then
quadrilaterals and other more complex shapes could be
processed.
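
A rough software analogue of ANDing the edge masks is shown
below. It is a sketch under assumptions rather than a model of
the circuit: each scan line mask is held as an integer with one
bit per pixel, the edge coefficients are assumed to be signed
so that the inside of the triangle is positive, and the
function names are hypothetical.

```python
def edge_line_mask(a, b, c, y, width=32):
    """Bit x is set when the edge value a*x + b*y + c is non-negative."""
    mask = 0
    for x in range(width):
        if a * x + b * y + c >= 0:
            mask |= 1 << x
    return mask


def triangle_tile_mask(edges, width=32, height=32):
    """AND the per-edge line masks, one scan line at a time, for a whole tile."""
    rows = []
    for y in range(height):
        row = (1 << width) - 1  # start with every pixel set
        for (a, b, c) in edges:
            row &= edge_line_mask(a, b, c, y, width)
        rows.append(row)
    return rows
```

Adding a fourth edge to the `edges` list handles a convex
quadrilateral in the same way, at the cost of one more AND per
scan line.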

When the screen of the image is divided into a plurality of
tiles the current hardware implementations require all objects
within the scene to be processed for each tile. This is
inefficient since it means that all the tiles have to be
processed for all the objects.

In conventional rendering systems, rendering of the screen on
a tile by tile basis requires objects to be clipped to tile
boundaries, and therefore data defining the intersections with
tile boundaries has to be defined (see Figure 4).

It is only necessary to process the objects which intersect
with a particular region area. As explained above, if an
object is defined in screen space, then a comparison of the
vertices which define the object, such as a triangle, will
yield a bounding box for that object. A bounding box defines a
rectangular area within the screen which contains the object.
Figure 4 shows a tiled region of the screen with an object
represented by a number of triangles within it.

A bounding box for a particular object can be aligned to tile
boundaries so that a list of tiles within the bounding box can
then be obtained. This list of tiles is a subset of all the
tiles within the screen and approximates the tiles which
intersect with the object. In the event that the bounding box
with the object intersects with the whole of the screen area,
then the object parameters (coordinates, shading data, etc)
are written into an area of memory within the system and a
pointer to the start of the object data is generated.
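
Aligning a bounding box to tile boundaries amounts to a few
integer divisions. The sketch below assumes integer pixel
coordinates, square tiles and a 640 x 480 screen; the function
name, the tile size and the screen dimensions are illustrative
assumptions and are not taken from the specification.

```python
def tiles_in_bounding_box(vertices, tile_size=32, screen_w=640, screen_h=480):
    """Return the (tile_x, tile_y) pairs covered by the object's bounding box.

    The box is clamped to the screen; an empty list means the object is
    entirely off screen and need not be written to memory.
    """
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    x_min, x_max = max(min(xs), 0), min(max(xs), screen_w - 1)
    y_min, y_max = max(min(ys), 0), min(max(ys), screen_h - 1)
    if x_min > x_max or y_min > y_max:
        return []
    tx0, tx1 = x_min // tile_size, x_max // tile_size
    ty0, ty1 = y_min // tile_size, y_max // tile_size
    return [(tx, ty) for ty in range(ty0, ty1 + 1) for tx in range(tx0, tx1 + 1)]
```

For instance, a triangle with vertices (40, 10), (100, 70) and
(20, 50) spans tile columns 0 to 3 and tile rows 0 to 2, giving
a 4 x 3 box of 12 tiles.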

The present rendering system operates on a tile by tile
basis, processing the objects for each tile before progressing
onto the subsequent one. The data structure is therefore used
to identify the objects which must be processed for each tile.
This is shown in Figure 5. In this, a list of tiles within the
screen is created in a region or tile array 30. Each tile is
defined by x and y limits. For each tile, a list of pointers
to objects which must be processed for that tile is generated
as an object list 32. There is a separate object list for each
tile pointed to by the region array. The bounding box idea is
used to create a list of tiles (with object lists) that the
object pointer, which is created when data is written to
memory, must be added to. However, the hardware needs to
identify the tail of each object list so that an address for
the object pointer to be written to can be derived. The most
simple method of doing this is to store a tail pointer which
points to the next free location on the list. This can be a
header in the object list.
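
The region array with per-tile object lists and tail pointers
can be pictured with the following sketch. It is a simplified
software model with hypothetical names: each object list has a
fixed capacity, the tail pointer is kept in a separate array
rather than as a header in the list, and object pointers are
plain integers.

```python
class TileObjectLists:
    """Region array: one object list and one tail pointer per tile."""

    def __init__(self, tiles_x, tiles_y, capacity):
        self.tiles_x = tiles_x
        count = tiles_x * tiles_y
        self.lists = [[None] * capacity for _ in range(count)]
        self.tails = [0] * count  # index of the next free slot in each list

    def _index(self, tx, ty):
        return ty * self.tiles_x + tx

    def add_object(self, object_pointer, tiles):
        """Append the pointer to the list of every tile in the bounding box."""
        for tx, ty in tiles:
            i = self._index(tx, ty)
            self.lists[i][self.tails[i]] = object_pointer
            self.tails[i] += 1

    def objects_for_tile(self, tx, ty):
        """Pointers to process when rendering tile (tx, ty)."""
        i = self._index(tx, ty)
        return self.lists[i][: self.tails[i]]
```

In this sketch a list that overflows its fixed capacity raises
an `IndexError`; a real system would need some way of growing
the list instead.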

An enhancement of this is to use a cache which can be a
smaller size. The cache stores a sub-set of the tail pointers.
As an object will generally cross multiple tile boundaries, a
miss upon the cache results in multiple tail pointers being
read in and predicting the tiles which the object traverses.
This increases the efficiency of the cache. This also enables
multiple images to be tiled at the same time by interleaving
the object data and changing the cache contents. This
switching involves storing the contents of the tail pointer
cache, adjusting the area of memory linked to the cache and
the area of memory used for storing objects. The effect of the
context is now changed. That is to say, the cache is
invalidated and a different set of data is now available to be
tiled. To switch context back is the reverse operation and
involves storage of the new context, the reversion of cache
and object memory locations, and the invalidation of the
current cache.

The information for the object lists is now available: an
address for the pointer, which comes from the tail pointer
cache, and an object pointer pointing to an object which has a
bounding box intersecting with that tile. All the object list
entries for the object being processed can then be written to
memory and the next object processed.

This is implemented using the circuitry of Figure 11. In
this, object data is received from the application program in
the form of triangles, fans, strips and points. Initially the
object data is all converted into strips, in a conversion unit
40. These are efficient in their memory usage. The converter
40 comprises a converter 42 for converting fans and faces to
strips and a converter 44 for converting points and lines to
strips. The strip data is then provided to a bounding box
generator 46 which calculates the bounding box for each
triangle within the strip, and the bounding box for the whole
strip. If the bounding box intersects with the screen area the
object data is written to memory via a local read/write
arbiter 48, starting from the next available location.
Otherwise the system moves on to the next strip. The address
which this data is written to is passed down the pipeline.

A region generator 50 receives the bounding box information
and generates a mask and tile identity for each tile within
the bounding box for the whole strip. The tile identity is
used to access a tail pointer cache 52 to read the next
available pointer location. If this is the last address within
the block, a new pointer block is allocated for this tile, and
a link from the current block to the new one is generated.

A write request to a free address for the pointer, with the
object address and the mask for that object, is placed into a
queue. The tail pointer for the tile is then updated through
the cache with the next available pointer. When there are
sixteen entries in the write queue, the requests are sorted by
page address by the pointer sorter 54. These are written into
the memory in a first access; this reduces the number of page
breaks to the memory.

The most common type of cheap block RAM is DRAM. This is
structured in pages, and accesses which traverse pages are
costly. This is because there is a performance cost due to
closing the current page and opening a new page. Writing a
pointer to the same object into multiple lists involves a
large number of page transitions as each list may be on a
different page. However, it is probable that there will be a
similarity between one incoming object and the next object.
This means that the next object is likely to be placed in
similar object lists as current and previous objects. With an
object list, the addresses are essentially sequential and it
is therefore desirable to write as many pointers within the
same list at the same time as there is address coherency
between pointers. This may be achieved by storing a number of
pointers (e.g. over a range of objects) and sorting them into
page groups before writing them to memory. This greatly
reduces the number of page transitions and therefore increases
the efficiency of the system.
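
The benefit of sorting queued pointer writes into page groups
can be seen in a small sketch. The page size of 1024 addresses
and the function names are assumptions made for illustration;
the specification does not state a page size.

```python
PAGE_SIZE = 1024  # assumed number of addresses per DRAM page


def count_page_breaks(addresses, page_size=PAGE_SIZE):
    """Number of times consecutive writes land on a different page."""
    pages = [address // page_size for address in addresses]
    return sum(1 for prev, cur in zip(pages, pages[1:]) if prev != cur)


def sort_by_page(requests, page_size=PAGE_SIZE):
    """Group queued (address, object_pointer) writes by page.

    Python's sort is stable, so writes to the same page keep their
    original order.
    """
    return sorted(requests, key=lambda request: request[0] // page_size)
```

Three objects written alternately to two lists on different
pages, at addresses 0, 2048, 4, 2052, 8 and 2056, cause five
page breaks in arrival order but only one after sorting.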

For a given object data set for an image, it is not possible
to determine the number of objects which will be visible in
each tile. The worst case scenario is that it will be
necessary to allocate enough memory for a pointer to every
object for every tile. This would require a large amount of
memory and increase the system cost. This can be reduced by
allocating blocks for each object list and, when a block has
been filled, allocating a new block and inserting a link to
the new block. This means that the memory used is closer to
the minimum amount required for object list storage. The size
of a block will depend upon many factors such as the width of
the memory and the available bandwidth.
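
Block-based allocation of object lists can be sketched as a
pool of fixed-size blocks chained by links. The class names
are hypothetical, and as a simplification the link is stored
alongside each block rather than in the last address of the
block.

```python
class PointerMemory:
    """Pool of fixed-size pointer blocks shared by all object lists."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.blocks = []  # each block: {"entries": [...], "next": index or None}

    def allocate(self):
        self.blocks.append({"entries": [], "next": None})
        return len(self.blocks) - 1


class LinkedObjectList:
    """Object list for one tile, grown one block at a time."""

    def __init__(self, memory):
        self.memory = memory
        self.head = memory.allocate()
        self.tail = self.head

    def append(self, object_pointer):
        block = self.memory.blocks[self.tail]
        if len(block["entries"]) == self.memory.block_size:
            # Current block is full: allocate a new one and link to it.
            new_index = self.memory.allocate()
            block["next"] = new_index
            self.tail = new_index
            block = self.memory.blocks[new_index]
        block["entries"].append(object_pointer)

    def __iter__(self):
        index = self.head
        while index is not None:
            block = self.memory.blocks[index]
            yield from block["entries"]
            index = block["next"]
```

A busy tile therefore consumes several blocks while a nearly
empty tile consumes one, instead of every tile reserving space
for every object.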

In order to reduce both the number of object pointers and the
size of object data further, another aspect of object
coherency can be used. Since generally a group of triangles
will be used to represent a larger object such as a teapot or
sphere or animal, etc., there will be a large amount of
commonality between triangles, i.e., the triangles will share
vertices between them. By comparing the vertices against each
other, it is possible to convert triangles to strips. A strip
takes up less area of memory as only one or two vertices are
required to define a new triangle and only one pointer is then
required to point to all the objects in the strip. This
reduces the number of object pointers even further and also
reduces the amount of memory required, thereby resulting in an
increase in efficiency in terms of memory and a performance
increase due to bandwidth optimisations.

In Figure 6 there is illustrated a triangle and a bounding
box, this being the shaded portion. When it is processed using
conventional methods, the region within which it falls covers
a 5 x 5 array of tiles and it would be necessary to process it
25 times. However, if the image is first processed using a
bounding box, to define the region which holds the range of
x,y coordinates used by the triangle, it can be shown that the
triangle only needs to be processed 12 times, i.e., it covers
12 tiles.

We have further appreciated that in fact the triangle only
falls within 10 of the tiles in the 4 x 3 array, thus reducing
further the processing overhead.

Other examples of triangles which do not cover the whole of
the rectangular bounding box required to process them are
shown in Figures 7a-d. The most extreme example of these is
Figure 7d in which the triangle shown only in fact falls in
the 12 tiles illustrated in Figure 8. It will be preferable to
process only this set of tiles in order to render that
triangle.

The calculation of the minimal set of tiles to represent a
triangle begins with the crude rectangular bounding box
calculation. If the bounding box is only a single tile in
either height or width, there is clearly no further
optimisation that can be performed. Otherwise the set will be
reduced by consideration of each edge of the triangle in turn.

Firstly, it is necessary to know whether the triangle is
defined by a clockwise (cw) or anti-clockwise (acw) set of
points. If this information is not available, it can easily be
calculated.

An edge can then be considered to be an infinitely long line
which divides the space into two halves. Sub-spaces on either
side of the edge are described as being inside or outside the
edge using the edge processors described above, with the
inside sub-space being the one that contains the triangle to
which the edge belongs. The triangle has its corners at the
intersections of the edge lines and the surface is the
intersection of the inside sub-spaces of the three edges.

Any tile that lies entirely on the outside of an edge is not
part of the minimal set because the triangle would not be
visible in that tile. If an edge is entirely horizontal or
vertical it need not be considered since all the tiles in the
rectangular bounding box already lie wholly or partly inside
the edge.

In order to test whether a tile lies wholly on the outside of
an edge, we need only test the point on that corner of the
tile which is closest to the edge. If that point is on the
outside of the edge, then we can be confident that the entire
tile is also outside the edge. The position of this test point
is determined by the orientation of the edge as indicated in
the table given in Figure 9.

The edge itself can be described using the equation

    y = mx + c

where x and y are coordinates of the screen, m represents the
gradient of the line, and c is a constant. The evaluation of
mx + c at the corner of a tile will give a value that is
greater than, less than or equal to the y coordinate of that
point. The comparison of the two values will indicate whether
the point lies on the inside or outside of the edge. The
interpretation of this result depends on the orientation of
the edge as given in the table in Figure 9.

For each edge of the triangle, each tile in the rectangular
bounding box must be processed in this way to decide whether
or not it should be excluded from the minimal set.

It should be noted that the test point at the corner of a
tile is also the test point for a larger rectangular set of
tiles. In Figure 10, knowing that the test point of the marked
tile is outside an edge means that all the shaded tiles must
also be outside that edge. In this example, where the test
point is at the bottom right, it is most efficient to process
the tiles of the rectangular bounding box from right to left
and from bottom to top, as a large number of tiles may then be
excluded from the minimal set with the minimum number of
tests. When the test point is in a different corner of the
tile, the order of processing would be changed accordingly.
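
The reduction of the bounding box to a minimal tile set can be
sketched as below. Note the substitutions: the sketch uses the
edge form Ax + By + C with the inside signed positive instead
of y = mx + c and the table of Figure 9, it picks the test
corner from the signs of A and B, and it tests every tile
individually rather than using the processing order described
above. All names and the tile size are illustrative
assumptions.

```python
def inside_positive_edges(v1, v2, v3):
    """Edge coefficients (A, B, C), signed so the triangle interior is positive."""
    def coeffs(p, q):
        return (q[1] - p[1], -(q[0] - p[0]), q[0] * p[1] - q[1] * p[0])

    edges = [coeffs(v1, v2), coeffs(v2, v3), coeffs(v3, v1)]
    a, b, c = edges[0]
    if a * v3[0] + b * v3[1] + c < 0:
        edges = [(-ea, -eb, -ec) for (ea, eb, ec) in edges]
    return edges


def minimal_tiles(v1, v2, v3, tile_size=32):
    """Tiles of the bounding box that are not wholly outside any edge."""
    xs, ys = [v1[0], v2[0], v3[0]], [v1[1], v2[1], v3[1]]
    tx0, tx1 = min(xs) // tile_size, max(xs) // tile_size
    ty0, ty1 = min(ys) // tile_size, max(ys) // tile_size
    edges = inside_positive_edges(v1, v2, v3)
    kept = []
    for ty in range(ty0, ty1 + 1):
        for tx in range(tx0, tx1 + 1):
            left, right = tx * tile_size, (tx + 1) * tile_size
            top, bottom = ty * tile_size, (ty + 1) * tile_size
            outside = False
            for (a, b, c) in edges:
                # Test the corner lying furthest towards the inside of the
                # edge; if even that corner is outside, the whole tile is.
                x = right if a > 0 else left
                y = bottom if b > 0 else top
                if a * x + b * y + c < 0:
                    outside = True
                    break
            if not outside:
                kept.append((tx, ty))
    return kept
```

For the right-angled triangle (0, 0), (127, 0), (0, 127) with
32 pixel tiles, the bounding box holds 16 tiles but only the 10
tiles with tile_x + tile_y no greater than 3 are kept.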

1. A method for shading 3-dimensional computer generated
images comprising the steps of:

representing each object in the image by a set of flat
polygons;

for each polygon, supplying a set of vertex data defining the
vertices of the polygon, along with data defining the
orientation of its surface;

computing edge data from the vertex data, the edge data being
equivalent to data defining a surface perpendicular to a
viewpoint and facing towards the polygon;

for each pixel used to view the portion of the image in which
the polygon is located, deriving a depth value for each
surface; and

shading that pixel when the depth values indicate that the
polygon is visible at that pixel.

2. A method according to claim 1 in which a surface is
visible at a particular pixel when its depth value is positive
and each of the edge surfaces has a positive depth value at
that pixel.

3. A method according to claim 1 including the step of
determining edge transitions from the edge data, and deriving
an edge mask for each edge.

4. A method according to claim 3 in which the edge mask is
derived line by line of pixels, one line per clock cycle.

5. A method according to claim 4 in which the edge mask is
derived line by line for each of a plurality of rectangular
sub-regions of the image.

6. A method according to claim 4 or 5 in which the edge mask
is derived for each surface and includes the step of deriving
a depth mask for that surface from the edge masks.

7. Apparatus for shading 3-dimensional computer generated
images comprising:

means for representing each object in the image by a set of
flat polygons;

means for supplying a set of vertex data defining the
vertices of each polygon, along with data defining the
orientation of its surface;

means for computing edge data from the vertex data, the edge
data being equivalent to data defining a surface perpendicular
to a viewpoint and facing towards the polygon;

means for deriving depth data for each surface for each pixel
used to view the portion of the image in which the polygon is
located; and

means for shading that pixel when the depth values indicate
that the polygon is visible at that pixel.

8. Apparatus according to claim 7 in which a surface is
visible at a particular pixel when its depth value is positive
and each of the edge surfaces surrounding it has a positive
depth value at that pixel.

9. Apparatus according to claim 7 including means for
determining edge transitions from the edge data, and means for
deriving an edge mask for each edge from the edge transitions.

10. Apparatus according to claim 9 in which the edge mask is
derived line by line of pixels in the image, one line per
clock cycle.

11. Apparatus according to claim 10 in which the edge mask is
derived line by line for each of a plurality of rectangular
sub-regions of the image.



12. Apparatus according to claim 10 or 11 in which the edge
mask is derived for each edge defining a surface and further
comprising means for deriving a depth mask for that surface
from the edge mask.

13. A method according to any of claims 1 to 6 in which no
edge clipping is required when a polygon intersects a boundary
of the image being shaded.